Intensity normalization in imaging mass spectrometry

ABSTRACT

The present invention relates generally to a species (analyte) separation and analysis system, for instance a spectrometry system, comprising a processor for receiving and processing signals from its detector to remove undesirable variation or noise before further processing into a spectrum, whereby the processor is programmed with a novel program for normalization preprocessing of the signals of said separation and analysis system.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. application Ser. No. 13/640,367 filed Oct. 10, 2012, which is a § 371 national stage entry of International Application No. PCT/BE2011/000022, filed Apr. 12, 2011, which claims priority to Great Britain Patent Application No. 1005959.0 filed Apr. 12, 2010, the entire contents of which are incorporated herein by reference.

BACKGROUND AND SUMMARY

Background of the Invention

A. Field of the Invention

The present invention relates generally to a system or apparatus adapted to separate and quantitatively analyze a species (analyte), for instance a spectrometry system to measure the molecular content of a species in a sample or in a matrix such as a tissue or a biofilm, the system or apparatus comprising a processor for receiving and processing signals from its detector to remove undesirable variation or noise before further processing into a spectrum, whereby the processor is programmed by a novel program for normalization preprocessing of the signals, which performs remarkably better than area-under-the-curve based approaches (such as the standard TIC based approach, but also other measures such as root-mean-square) in a separation and quantitative analysis system. Additionally, the intelligent differentiation built into the approach typically outperforms approaches that depend on prior selection of a subset of variables (e.g. a mass range in mass spectrometry or an elution time window in chromatography), and automates this phase to avoid requiring user-supplied parameters or interaction.

Several documents are cited throughout the text of this specification. Each of the documents herein (including any manufacturer's specifications, instructions etc.) is hereby incorporated by reference; however, there is no admission that any document cited is indeed prior art of the present invention.

B. Description of the Related Art

In all fields of mass spectrometry, including mass spectral imaging, proper preprocessing of the acquired data enables a good interpretation of the measurements [M. Hilario, A. Kalousis, C. Pellegrini, and M. Mutter, Mass Spectrom Rev, vol. 25, no. 3, pp. 409-49, 2006; R. Hussong and A. Hildebrandt, Methods Mol Biol, vol. 604, pp. 145-61, 2010; L. Nie, G. Wu, and W. Zhang, Crit Rev Biotechnol, vol. 28, no. 4, pp. 297-307, 2008 and J. L. Norris, D. S. Cornett, J. A. Mobley, M. Andersson, E. H. Seeley, P. Chaurand, and R. M. Caprioli, “Processing maldi mass spectra to improve mass spectral direct tissue analysis,” Int J Mass Spectrom, vol. 260, no. 2-3, pp. 212-221, February 2007]. There remains a need in the art for such proper preprocessing tools, in particular for species with a high molecular content, as is often the case with mass spectrometry and mass spectral imaging providing anatomical images or spectra with a risk of multiplicative noise. Overall, the goal of the preprocessing phase is to filter undesirable influences from the raw measurements, and to provide a cleaned-up data set for further downstream analysis.

Such a preprocessing method will try to remove undesirable variation or noise, in particular the causes of ion intensity noise, from the mass spectral measurements in preparation for direct human interpretation or higher-level statistical analysis. Most preprocessing methods will attempt to counteract only a specific noise type, and as a result the preprocessing phase of a study can entail various steps. Typical examples include: baseline correction: quantifying and removing the chemical noise background; calibration: projecting the m/z range onto a set of known calibrants; alignment: projecting several spectra onto a common m/z scale; normalization: projecting peak heights from several spectra onto a common intensity scale; smoothing/denoising: removing ion detector and data acquisition induced jitter; and peak detection: converting a mass spectral profile to a discrete set of peaks.

The present invention provides a new method of normalization of ion intensities across different mass spectra, which overcomes the problem of undesirable variation or noise in the mass spectral measurements. This provides the ability to compare peak heights from one mass spectrum to another, which is particularly important both for standard mass spectrometry and for mass spectral imaging. The invention also covers the application of this form of normalization in related fields such as chromatography, whereby chromatographic peak height (for example in liquid chromatography or a hyphenated mass spectrometry setup) becomes comparable from one measurement to another.

The performance of this new procedure of the present invention, called Ionization Efficiency Correction (IEC), is demonstrated in several examples of this application. A first example follows a common experimental design in the field of biomarker discovery, in which mass spectrometry is used to compare the content of different samples. The data set is synthetically generated to provide a gold standard against which the algorithm's performance can be weighed. The second example is a mass spectral imaging experiment on a sagittal section of mouse brain, and highlights the value of the new approach from an imaging standpoint.

SUMMARY OF THE INVENTION

Some embodiments of the invention are set forth (in claim format) directly below:

In an embodiment, the present invention relates to an apparatus adapted to separate and quantitatively analyze a species, the apparatus comprising a processor adapted to receive and process signals from its detector to remove undesirable variation or noise before further processing into a spectrum, characterized in that the processor is programmed for a normalization preprocessing of the signals of said apparatus, whereby the normalization process comprises the steps of:

a. Providing a data set of multiple spectra to normalize a given spectrum to. (The data set can be a complete experiment or a subset of an experiment.)

b. Separating the part common to all spectra from the parts that are differential.

c. Identifying which parts of the relative profiles of all these spectra are commonly found across the entire data set.

d. For each spectrum, calculating its common ion current (CIC), which is the sum of all ion counts belonging only to the part of the spectrum that is common in relative profile to other spectra in a data set.

e. For each spectrum, scaling back the spectrum with the inverse of its CIC or a CIC-derived scaling factor.
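As an illustration of steps d and e, the following Python sketch computes a CIC per spectrum and rescales the whole spectrum by its inverse. It assumes the common m/z bins have already been identified (steps b and c) and are supplied as a boolean mask; the names `cic_normalize` and `common_mask` and the toy data are hypothetical and are not taken from the specification.

```python
def cic_normalize(spectra, common_mask):
    """Scale every spectrum by the inverse of its common ion current (CIC).

    spectra     : list of spectra, each a list of ion counts per m/z bin
    common_mask : list of booleans, True where a bin belongs to the common part
                  (assumed to be known already; how it is found is not shown here)
    """
    normalized = []
    for spectrum in spectra:
        # Step d: CIC = sum of ion counts over the common bins only.
        cic = sum(count for count, is_common in zip(spectrum, common_mask) if is_common)
        if cic <= 0:
            raise ValueError("spectrum has no ion counts in the common bins")
        # Step e: scale the entire spectrum, differential bins included.
        normalized.append([count / cic for count in spectrum])
    return normalized


# Toy data: the second spectrum is the first one ionized twice as efficiently,
# plus a differential peak in the third bin.
spectra = [
    [10.0, 20.0, 0.0, 10.0],
    [20.0, 40.0, 60.0, 20.0],
]
common_mask = [True, True, False, True]
normalized = cic_normalize(spectra, common_mask)
```

In this toy case the common peaks of both spectra coincide after scaling, while the differential peak of the second spectrum is preserved rather than absorbed into the scaling factor.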

In another embodiment, the present invention relates to an apparatus adapted to separate and quantitatively analyze a species, comprising a processor for receiving and processing signals from its detector, characterized in that, to remove undesirable variation or noise before further processing into a spectrum, the processor is programmed for a normalization preprocessing of the signals of said apparatus, whereby the normalization process comprises the steps of:

a. Providing a data set of multiple spectra to normalize a given spectrum to. (The data set can be a complete experiment or a subset of an experiment.)

b. Identifying which parts of the relative profiles of all these spectra are commonly found across the entire data set.

c. Obtaining the estimate for the CIC to separate the sum of all ion counts belonging to the part of the spectrum that is common in relative profile to other spectra in a data set (common ion current (CIC)) from the sum of all ion counts belonging to the part of the spectrum that is not common in relative profile to other spectra in a data set.

d. For each spectrum, calculating its common ion current (CIC), which is the sum of all ion counts belonging to the part of the spectrum that is common in relative profile to other spectra in a data set.

e. For each spectrum, scaling back the spectrum with the inverse of its CIC or a CIC-derived scaling factor.

Another embodiment of the present invention comprises an apparatus adapted to separate and quantitatively analyze a species, comprising a processor for receiving and processing signals from its detector, characterized in that, to remove undesirable variation or noise before further processing into a spectrum, the processor is programmed for a normalization preprocessing of the signals of said apparatus, whereby the normalization process comprises the steps of:

a. Providing a data set of multiple spectra to normalize a given spectrum to. (The data set can be a complete experiment or a subset of an experiment.)

b. Identifying which parts of the relative profiles of all these spectra are commonly found across the entire data set.

c. By a decomposition algorithm, extracting from a collection of spectra a single pseudo-spectrum that only contains common ion peaks and relative peak heights.

d. For each spectrum, calculating the area-under-the-curve of the scaled common profile (common ion current (CIC)).

e. For each spectrum, scaling back the spectrum with the inverse of its CIC or a CIC-derived scaling factor.

In another embodiment, the present invention relates to an apparatus adapted to separate and quantitatively analyze a species, comprising a processor for receiving and processing signals from its detector, characterized in that, to remove undesirable variation or noise before further processing into a spectrum, the processor is programmed for a normalization preprocessing of the signals of said apparatus, whereby the normalization process comprises the steps of:

a. Providing a data set of N spectra that each contain M m/z bins.

b. Searching for a rank-1 approximation of the two-mode array or matrix containing all the spectra by organizing a rank-1 approximation of the N×M data matrix, while penalizing differential peaks in the profile vector.

c. Generating a 1×M vector containing the common spectral profile and an N×1 vector containing scaling factors.

d. Using the scaling factors to calculate the area-under-the-curve of the scaled common profile (common ion current (CIC)).

e. For each spectrum, scaling back the spectrum with the inverse of these scaling factors or a derivation thereof.
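The steps above can be written compactly as follows. This is one possible formalization offered as a sketch, not the specification's own notation: the symbols X, s, p, the squared-error objective, the penalty function Ω and its weight λ are all assumptions introduced here, since the text does not state the form of the penalty on differential peaks.

```latex
% Data matrix of N spectra with M m/z bins each (non-negative ion counts)
X \in \mathbb{R}_{\ge 0}^{N \times M}, \qquad
s \in \mathbb{R}_{\ge 0}^{N \times 1}, \qquad
p \in \mathbb{R}_{\ge 0}^{1 \times M}

% Rank-1 approximation with an (assumed, unspecified) penalty on differential peaks in p
(\hat{s}, \hat{p}) = \arg\min_{s \ge 0,\; p \ge 0} \; \lVert X - s\,p \rVert_F^2 + \lambda\, \Omega(p)

% Common ion current of spectrum i: area under the scaled common profile
\mathrm{CIC}_i = \hat{s}_i \sum_{j=1}^{M} \hat{p}_j

% Normalized spectrum i: the entire spectrum is scaled by the inverse of its CIC
\tilde{x}_{ij} = \frac{x_{ij}}{\mathrm{CIC}_i}
```

If the profile is rescaled so that its entries sum to one, the CIC of spectrum i reduces to its scaling factor, which matches the wording of step e.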

In yet another embodiment, the present invention relates to an apparatus adapted to separate and quantitatively analyze a species, comprising a processor for receiving and processing signals from its detector, characterized in that, to remove undesirable variation or noise before further processing into a spectrum, the processor is programmed for a normalization preprocessing of the signals of said apparatus, whereby the normalization process comprises the steps of:

a. Providing a data set of N spectra that each contain M m/z bins.

b. Running a non-negative matrix factorization (NMF) algorithm several times on the data set in rank-1 mode, whereby in each iteration the differential residuals are deducted from the data set.

c. Generating a 1×M vector containing the common spectral profile and an N×1 vector containing scaling factors.

d. Using the scaling factors to calculate the area-under-the-curve of the scaled common profile (common ion current (CIC)).

e. For each spectrum, scaling back the spectrum with the inverse of these scaling factors or a derivation thereof.
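A minimal Python sketch of this iterated rank-1 scheme is given below, under stated assumptions. The rank-1 fit uses plain alternating least squares, which stays non-negative for non-negative data. The rule for what counts as a "differential residual" (a count exceeding the fitted model by more than a fraction `rel_tol` of its own value) is an assumption made for illustration, as are the function names, the number of rounds and the toy data; the specification does not fix any of these.

```python
def rank1_fit(data, n_iter=100):
    """Non-negative rank-1 fit data[i][j] ~ scale[i] * profile[j] (alternating least squares)."""
    n, m = len(data), len(data[0])
    profile = [sum(row[j] for row in data) / n for j in range(m)]  # start from the mean spectrum
    scale = [0.0] * n
    for _ in range(n_iter):
        pp = sum(p * p for p in profile)
        scale = [sum(x * p for x, p in zip(row, profile)) / pp for row in data]
        ss = sum(s * s for s in scale)
        profile = [sum(data[i][j] * scale[i] for i in range(n)) / ss for j in range(m)]
    # Remove the scale ambiguity: let the profile sum to 1, so scale[i] is the CIC estimate.
    total = sum(profile)
    return [s * total for s in scale], [p / total for p in profile]


def iec_scaling_factors(spectra, n_rounds=30, rel_tol=0.25):
    """Repeat the rank-1 fit, deducting assumed differential residuals after every round."""
    work = [list(row) for row in spectra]  # working copy; the input is left untouched
    for _ in range(n_rounds):
        scale, profile = rank1_fit(work)
        for i, row in enumerate(work):
            for j, x in enumerate(row):
                model = scale[i] * profile[j]
                # Assumed rule: counts well above the common model are differential.
                if x > 0 and (x - model) > rel_tol * x:
                    row[j] = model  # deduct the differential residual
    return rank1_fit(work)


# Toy data: common profile [1, 2, 0, 1] at three ionization efficiencies,
# with a differential peak (60 counts) in the third bin of the second spectrum.
spectra = [
    [10.0, 20.0, 0.0, 10.0],
    [20.0, 40.0, 60.0, 20.0],
    [30.0, 60.0, 0.0, 30.0],
]
cic, common_profile = iec_scaling_factors(spectra)
# Scale the original, complete spectra (not the working copy) by the inverse CIC.
normalized = [[x / c for x in row] for row, c in zip(spectra, cic)]
```

On this toy data the differential peak is progressively removed from the working copy, so the final scaling factors reflect only the common profile, while the normalized output still contains the differential peak.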

Another aspect of the present invention relates to an apparatus adapted to separate and quantitatively analyze a species, comprising a processor for receiving and processing signals from its detector, characterized in that, to remove undesirable variation or noise before further processing into a spectrum, the processor is programmed for a normalization preprocessing of the signals of said apparatus, whereby the normalization process comprises the steps of:

a. Establishing a pseudo-spectrum of the common peaks and generating a scaling factor for each individual spectrum to separate the common ion counts from the differential ion counts.

b. Estimating the CIC of a spectrum as the area-under-the-curve of the common profile (determined in step one), scaled by that spectrum's individual scaling factor (also determined in step one).

c. Scaling back the entire spectrum, not just the common parts, with the inverse of the CIC or a derivation thereof.

The apparatus according to any one of the embodiments disclosed above can be characterized in that the processor is programmed for a normalization preprocessing of the signals of said apparatus, whereby the normalization process is without a total ion current (TIC)-based normalization step, to assure that ionization efficiencies are compared and rectified on the basis of the parts that are common between two spectra and not on the basis of the parts that are differential; or can be characterized in that the processor is programmed for a normalization preprocessing of the signals of said apparatus, whereby the normalization process is without a total ion current (TIC)-based normalization step, to assure that only ion counts from analytes common to all spectra are used to calculate the normalization.

The apparatus according to any one of the embodiments referred to hereabove is in a particular embodiment adapted to measure the molecular content of species in a carrier, for instance in a tissue.

As an additional particular feature, in the apparatus according to any one of the embodiments the normalization is an ionization efficiency correction, or the apparatus is a chromatography system; is a molecular chromatography system; is a chromatography-spectroscopy system; is an ionization measurement apparatus; is a spectrometer; is a mass spectrometer; is an ion mass spectrometer; or is a spectrometer whereby the processor receives and processes signals (e.g. current signals) from the ion detection means, whereby for instance the signals are processed into information that demonstrates relative current produced by ions (relative abundance or relative intensity) in relation to varying mass/charge ratios. Moreover, such a spectrometer can comprise an electronic detection means for ion detection (the detector), and further comprises a means for desorption or vaporization, an ionization means (the ion source) and an ion acceleration means with ion separation or deflection means to separate ions, for instance according to their mass and charge (the mass analyzer). Another feature of the present invention can be that the spectrometer comprises 1) an ion source for ionizing a specimen (e.g. a vaporized sample) to generate ions (e.g. to convert gas phase sample molecules into ions); 2) an ion sorting means, the so-called mass or ion mobility analyser, respectively for sorting and separating ions according to their mass and charge or their mobility, which comprises an ion transport portion for transporting the ions (e.g. by acceleration in an electric or magnetic field) with a mass or mobility selection and/or analysing means for calculation of the m/z or mobility ratios based on the detailed motion of the ions passing the field (e.g. a time-of-flight analyzer, (linear) quadrupole mass analyzer, quadrupole ion trap or orbitrap available in the art); 3) a detector, optionally equipped with an amplifier, for recording either charge induced or current produced when an ion passes by or hits a surface; 4) a processor for receiving and processing signals from said detector; and 5) optionally a screen to display the mass spectrometric measurements.

An embodiment of the present invention relates to any apparatus according to any one of the previous embodiments, which comprises a storage means to store the processed signal electronically or which comprises a display means to display the relative abundance or intensity of ions with a specific mass-to-charge ratio (m/z) as peaks on a graphic (the mass spectrum).

Yet another embodiment of the present invention is the apparatus according to any one of the previous embodiments for use in a diagnostic medical treatment of a subject to diagnose for or visualize a disorder.

Yet another embodiment of the present invention is the apparatus according to any one of the previous embodiments for use in a diagnostic medical treatment of a subject to diagnose for relative peak height changes representative for a disease state or condition through classification. The apparatus according to any one of the previous embodiments can be used for analyzing high molecular content species such as tissues, biofilms, and complex molecules.

Other aspects of the present invention relate to various uses such as:

use of the apparatus according to any one of the previous embodiments for discovering new biomarkers;

use of the apparatus according to any one of the previous embodiments when operational in its processor to filter undesirable influences from its raw measurements and to provide a cleaned-up data set for further downstream analysis;

use of the apparatus according to any one of the previous embodiments when operational in its processor to remove undesirable variation or noise, in particular to remove the causes of ion intensity noise from the (mass) spectral measurements;

use of the apparatus according to any one of the previous embodiments when operational in its processor to improve processing into interpretable measurements;

use of the apparatus according to any one of the previous embodiments when operational in its processor to decrease or minimize influences other than abundance;

use of the apparatus according to any one of the previous embodiments when operational in its processor to increase the reliability of peak heights, comparable across different mass spectra or measurements;

use of the apparatus according to any one of the previous embodiments when operational in its processor to linearize the physical relationship between an amount of a particular molecular species and the peak height recorded at a certain mass-over-charge value for species with a molecular content;

use of the apparatus according to any one of the previous embodiments when operational in its processor to decrease noise factors that perturb peak height, such as wet lab factors (e.g. differences in sample preparation), instrument factors (e.g. ionization efficiency and ion detector saturation), ion intensity noise factors which are molecule-specific, or noise factors that have a global effect across the entire mass range (e.g. variation in the concentration of matrix crystals), whereby the noise factors can be global noise factors, possibly without a pre-estimated “noise scaling factor”;

use of the apparatus according to any one of the previous embodiments when operational in its processor to remove undesirable variation or noise before further processing into a spectrum or graph;

use of the apparatus according to any one of the previous embodiments when operational in its processor to normalize the ion intensities and make peak heights comparable from one mass spectrum to another;

use of the apparatus according to any one of the previous embodiments when operational in its processor to identify the presence or absence of an ion species, and/or to quantify said ions (in order to compare said ion species in a certain tissue sample with another tissue);

use of the apparatus according to any one of the previous embodiments when operational in its processor to remove disturbance factors on the abundance to improve comparison between spectra of a species or a sample, with the peak heights directly representing ion abundance and indirectly the concentration of an analyte;

use of the apparatus according to any one of the previous embodiments when operational in its processor to make peak heights (ion intensities) comparable from one spectrum to the next;

use of the apparatus according to any one of the previous embodiments when operational in its processor to make peak intensities comparable from one pixel to another in mass spectral imaging or imaging mass spectrometry;

use of the spectrometer according to any one of the previous embodiments when operational in its processor to analyze a sample containing biomolecules (or otherwise useful molecules) and to compare such molecules and their distribution across various samples;

use of the apparatus according to any one of the previous embodiments when operational in its processor to chart the variation in protein content and distribution associated with disease; and/or

use of the apparatus according to any one of the previous embodiments when operational in its processor to remove multiplicative noise that cannot be transformed through, for example, log-transformation.

Another embodiment of the present invention comprises a processor which is programmed for a normalization of “chromatographic” style output signals, where the signal represents a collection of peaks distributed across an x/y scale and where the peak heights are proportional to the concentration/abundance/intensity of a measured event, for instance output signals of an apparatus of the group consisting of liquid chromatograph (LC), gas chromatograph (GC) and densitometric scanner, or of a method of the group consisting of liquid chromatography (LC), gas chromatography (GC) and densitometric scanning, whereby the normalization process comprises the steps of:

a. Providing a data set of multiple measurements to normalize a given measurement to. (The data set can be a complete experiment or a subset of an experiment.)

b. Identifying which parts of the relative profiles of all these measurements are commonly found across the entire data set.

c. By a decomposition algorithm, extracting from a collection of measurements a single pseudo-measurement that only contains common peaks and relative peak heights.

d. For each measurement, calculating the area-under-the-curve of the scaled common profile (common area-under-the-curve (CAUC)).

e. For each chromatogram or measurement, scaling back its values with the inverse of its CAUC or a derivation thereof.

In another embodiment, the present invention relates to a method of diagnosing a disorder or biological abnormality, characterized in that the method comprises processing of a plurality of variables obtainable from assaying of spectroscopic images or profiles of a patient, whereby the method comprises normalization preprocessing of signals of said spectrometer, whereby the normalization process comprises the steps of:

a. Providing a data set of multiple spectra to normalize a given spectrum to. (The data set can be a complete experiment or a subset of an experiment.)

b. Separating the part common to all spectra from the parts that are differential.

c. For each spectrum, calculating its common ion current (CIC), which is the sum of all spectroscopic (usually ion) counts belonging to the part of the spectrum that is common in relative profile to other spectra in a data set.

d. For each spectrum, scaling back the spectrum with the inverse of its CIC or a derived measure thereof.

A particular embodiment of the present invention relates to an operating system for operating the methods according to any one of the previous embodiments mentioned hereabove, which controls the allocation of an assay system to generate biomarker values of a patient and which feeds the input signals from the assay system into a signal processor comprising a mathematical model that is described on the relationship of a plurality of biomarker variables and a plurality of disorder variables from assaying of biological samples of a plurality of patients with no disorder, affected with disorder, affected with a defined seriousness or with defined progress of disorder. Such an operating system can be for determining the presence or absence of disorder, the seriousness of disorder or the progress of disorder in the patient according to any one of the previous embodiments.

An additional feature is that the operating system according to any one of the embodiments also controls usage of the assay system.

As yet another additional feature, the operating system according to any one of the embodiments includes a user interface to enable the user to interact with the functionality of the computer.

As yet another additional feature, the operating system according to any one of the embodiments includes a graphical user interface whereby the operating system controls the ability to generate graphics on the computer's display device that can be displayed in a variety of manners representative for or associated with the condition of disorder in a selected patient or a group of patients, to allow a user to distinguish between the absence of disorder, the seriousness of disorder or the progress of disorder in identified patients or patient groups.

Yet another embodiment of the present invention concerns a computer-executable code, stored in a computer-readable medium, that is adapted, when running on a computer system, to run the operating system according to any one of the embodiments mentioned above or to execute the model described in any of the embodiments mentioned above, and to direct a processing means to produce output signals that are representative for a condition of disorder or a modifying condition of disorder.

DETAILED DESCRIPTION

Detailed Description of Embodiments of the Invention

The following detailed description of the invention refers to the accompanying drawings. The same reference numbers in different drawings identify the same or similar elements. Also, the following detailed description does not limit the invention. Instead, the scope of the invention is defined by the appended claims and equivalents thereof.

Definitions

“m/z” is the mass over charge ratio; PCA is principal component analysis [I. T. Jolliffe, Principal component analysis, 2nd ed. New York: Springer, 2002 and M. Ringner, “What is principal component analysis?” Nat Biotechnol, vol. 26, no. 3, pp. 303-4, March 2008].

An “assay” in the meaning of this application is an analysis or procedure to determine the presence or absence of one or more molecular species in an organism or an organic sample. A quantitative assay also measures the quantity of its target analyte in the sample.

The “total ion current” in the meaning of the present invention is the sum of the separate ion currents carried by the different ions contributing to the spectrum [A. D. McNaught and A. Wilkinson, Compendium of chemical terminology: IUPAC recommendations, 2nd ed. Oxford: Blackwell Science, 1997. [Online]. Available: goldbook.iupac.org/index.html]. From a mathematical standpoint, it is the sum of all ion counts in a mass spectrum irrespective of ion species, or the integral over the mass spectral profile.
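To make the definition concrete, the short Python sketch below computes the TIC as the plain sum of ion counts and applies a TIC-based scaling to two toy spectra. The function names and the data are illustrative assumptions. The second spectrum carries the same common profile as the first at twice the intensity, plus one differential peak, which is the situation in which the specification argues that TIC-based scaling is biased.

```python
def total_ion_current(spectrum):
    """TIC: sum of all ion counts in the spectrum, irrespective of ion species."""
    return sum(spectrum)


def tic_normalize(spectrum):
    """Scale a spectrum by the inverse of its TIC (the standard approach, shown for contrast)."""
    tic = total_ion_current(spectrum)
    if tic <= 0:
        raise ValueError("spectrum contains no ion counts")
    return [count / tic for count in spectrum]


spectrum_a = [10.0, 20.0, 0.0, 10.0]   # common profile only
spectrum_b = [20.0, 40.0, 60.0, 20.0]  # same profile at 2x intensity plus a differential peak

tic_a = total_ion_current(spectrum_a)
tic_b = total_ion_current(spectrum_b)
norm_a = tic_normalize(spectrum_a)
norm_b = tic_normalize(spectrum_b)
# The first (common) peak no longer matches between the two spectra after TIC scaling,
# because the differential peak inflates the TIC of spectrum_b.
common_peak_mismatch = norm_a[0] - norm_b[0]
```

The non-zero mismatch on a peak that is common to both spectra illustrates why the scaling factor in the present approach is derived from the common part only.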

“Ionization efficiency” in the meaning of this application is the ratio of the number of ions formed to the number of electrons or photons used in an ionization process [A. D. McNaught and A. Wilkinson, Compendium of chemical terminology: IUPAC recommendations, 2nd ed. Oxford: Blackwell Science, 1997. [Online]. Available: goldbook.iupac.org/index.html].

In this application a “mass” or “m/z” means a mass to charge ratio, and a “mass range” or an “m/z range” means a range for the mass to charge ratio. A linear dynamic range is the range over which an ion signal is linear in relation to the corresponding analyte concentration. Mass accuracy is the ratio of the m/z measurement error to the true m/z. The mass resolving power is the measurement of the ability to distinguish two peaks of slightly different m/z.
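As a small worked illustration of the last two definitions, the Python sketch below computes mass accuracy as the ratio of the measurement error to the true m/z. Expressing it in parts per million and computing resolving power as m/z divided by the smallest resolvable m/z difference are common conventions assumed here; they are not stated in the text above, and the function names are hypothetical.

```python
def mass_accuracy(measured_mz, true_mz):
    """Ratio of the m/z measurement error to the true m/z (dimensionless)."""
    return (measured_mz - true_mz) / true_mz


def mass_accuracy_ppm(measured_mz, true_mz):
    """The same ratio expressed in parts per million (assumed convention)."""
    return 1e6 * mass_accuracy(measured_mz, true_mz)


def resolving_power(mz, delta_mz):
    """m/z divided by the smallest m/z difference that can still be resolved (assumed convention)."""
    return mz / delta_mz


accuracy_ppm = mass_accuracy_ppm(1000.005, 1000.0)  # an error of 0.005 at m/z 1000
power = resolving_power(1000.0, 0.5)                # two peaks 0.5 apart at m/z 1000
```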

Spectrometry is the spectroscopic technique used to assess the concentration or amount of a given chemical (atomic, molecular, or ionic) species. In this case, the instrument that performs such measurements is a spectrometer, spectrophotometer, or spectrograph.

A mass spectrometer is an apparatus for the determination of the elemental composition of an analyte sample or molecule and/or for elucidating the chemical structures of molecules, such as peptides and other chemical compounds. The mass spectrometry principle consists of ionizing chemical compounds of an analyte to generate charged molecules or molecule fragments, transporting such ions by a potential (e.g. under either a static or a dynamic magnetic or electric field) and measuring their mass-to-charge (m/z) ratios.

A species in the meaning of this application is a particular analyte, molecule or chemical (atomic, molecular, or ionic). It can for instance concern peptides, polynucleotides, small molecules or lipoproteins.

A mass spectrometer for proteomics is, briefly, an apparatus that ionizes vaporized or desorbed samples to generate charged molecules or molecule fragments and that measures their mass-to-charge ratios. Typically such a mass spectrometer includes: 1) an ion source for ionizing a specimen (e.g. a vaporized sample) to generate ions (e.g. to convert gas phase sample molecules into ions); 2) an ion sorting means, the so-called mass analyser, for sorting and separating ions according to their mass and charge, which comprises an ion transport portion for transporting the ions (e.g. by acceleration in an electric or magnetic field) with a mass selection and/or analysing means for computation of the m/z ratios based on the detailed motion of the ions passing through the field (e.g. a time-of-flight analyzer, (linear) quadrupole mass analyzer, quadrupole ion trap or orbitrap available in the art); 3) a detector, optionally equipped with an amplifier, for recording either charge induced or current produced when an ion passes by or hits a surface; 4) a processor for receiving and processing signals from said detector; and 5) optionally a screen to display the mass spectrometric measurements.

“Electrospray ionization” (ESI) is a technique used in mass spectrometry to produce ions. It is especially useful in producing ions from macromolecules because it overcomes the propensity of these molecules to fragment when ionized. Mass spectrometry using ESI is called electrospray ionization mass spectrometry (ESI-MS) or, less commonly, electrospray mass spectrometry (ES-MS). Electrospray ionization is the ion source of choice to couple liquid chromatography with mass spectrometry. The analysis can be performed online, by feeding the liquid eluting from the LC column directly to an electrospray, or offline, by collecting fractions to be later analyzed in a classical nanoelectrospray-mass spectrometry setup.

“Matrix-assisted laser desorption ionization” (MALDI) is a technique used in mass spectrometry to produce ions. It is especially useful in producing ions from macromolecules because it overcomes the propensity of these molecules to fragment when ionized, by embedding the molecules into a ‘matrix’ of chemical crystals that absorb some of the impact energy from the laser. It is of particular interest with regard to applications that employ some form of surface chemistry, and its ability to retain the spatial origin of the measurements makes it well suited for molecular imaging approaches such as MALDI based mass spectral imaging, also known as imaging mass spectrometry.

“High molecular content”. Tissues, biofilms, and complex molecules have an inherent and high/complex molecular content. Imaging mass spectrometry is a mass spectrometry based method that can be directly applied to a tissue or to tissues to measure its molecular content. A high molecular content in the meaning of imaging mass spectrometry can be the parallel analysis of hundreds of biomolecules, exquisite sensitivity, qualitative and quantitative analysis, and the ability to distinguish between close variants and/or to simultaneously analyze the distribution of hundreds of such biomolecules. This can be reinforced with high-throughput imaging MS: for instance a Bruker UltrafleXtreme high speed mass spectrometer enables clinical tissue arrays to be analyzed at cellular resolution and thus each tissue to be described, analyzed and classified according to its molecular content. Furthermore, ultrahigh mass resolution imaging MS provides the possibility to distinguish lipids and metabolites which have almost identical masses. For instance the ultrahigh mass resolution of a 9.4 T Fourier transform ion cyclotron resonance mass spectrometer can distinguish between these ions and thus allows the distributions of many lipids and metabolites to be simultaneously measured. These instruments provide rich datasets and integrate the results with established single molecule molecular imaging technologies.

Preprocessing for Normalization

Preprocessing of signals of a spectrometer, in particular a mass spectrometer, aims at removing undesirable variation or noise before further processing into a spectrum. One of the primary preprocessing steps in mass spectrometry is normalization. The goal of a normalization procedure is to normalize the ion intensities and make peak heights comparable from one mass spectrum to another. Many applications of mass spectrometry require information not only on the presence or absence of an ion species, but also some indication of quantity regarding those ions. As there is a relationship between the concentration of an analyte and its ion count as reported in a mass spectrum, peak heights can serve as indicators of quantity. However, the reliable use of peak heights depends on whether influences other than abundance can be minimized. The need for reliable peak heights, comparable across different mass spectra, spans a very wide range of biochemical applications.

In qualitative analyses aiming to understand the content of a sample, peak height is sometimes used to establish an indication of confidence, by enabling the calculation of a signal-to-noise ratio (SNR) for each ion species under consideration. Qualitative analyses are typically found in areas such as protein identification [M. Kinter and N. E. Sherman, New York: John Wiley, 2000; B. Lu, A. Motoyama, C. Ruse, J. Venable, and J. R. Yates, 3rd, Anal Chem, vol. 80, no. 6, pp. 2018-25, March 2008; L. Martens and R. Apweiler, Methods Mol Biol, vol. 564, pp. 245-59, 2009; J. Stauber, L. MacAleese, J. Franck, E. Claude, M. Snel, B. K. Kaletas, I. M. V. D. Wiel, M. Wisztorski, I. Fournier, and R. M. A. Heeren, J Am Soc Mass Spectrom, vol. 21, no. 3, pp. 338-47, March 2010 and A. R. Farley and A. J. Link, Methods Enzymol, vol. 463, pp. 725-63, 2009] and the search for post-translational modifications [A. R. Farley and A. J. Link, “Identification and quantification of protein posttranslational modifications,” Methods Enzymol, vol. 463, pp. 725-63, 2009 and N. L. Young, M. D. Plazas-Mayorca, and B. A. Garcia, “Systems-wide proteomic characterization of combinatorial post-translational modification patterns,” Expert Rev Proteomics, vol. 7, no. 1, pp. 79-92, February 2010].

In quantitative applications, peak height as an indicator of abundance is central to the analysis. Quantitative analyses span a multitude of approaches ranging from isotope labeling to label-free methods, and from absolute quantification to relative profiling. An example of absolute quantification is the use of mass spectrometry as a pharmacokinetic assay, tying an absolute peak height to a certain concentration of the target analyte per unit volume or mass of sample [M. W. Duncan, P. J. Gale, and A. L. Yergey, The principles of quantitative mass spectrometry, 1st ed. Denver, Colo.: Rockpool Productions, 2006; M. W. Duncan, H. Roder, and S. W. Hunsucker, “Quantitative matrix-assisted laser desorption/ionization mass spectrometry,” Brief Funct Genomic Proteomic, vol. 7, no. 5, pp. 355-70, September 2008; H. Humbert, M. D. Cabiac, J. Barradas, and C. Gerbeau, “Evaluation of pharmacokinetic studies: is it useful to take into account concentrations below the limit of quantification?” Pharm Res, vol. 13, no. 6, pp. 839-45, June 1996; G. Liebisch, M. Binder, R. Schifferer, T. Langmann, B. Schulz, and G. Schmitz, “High throughput quantification of cholesterol and cholesteryl ester by electrospray ionization tandem mass spectrometry (esi-ms/ms),” Biochim Biophys Acta, vol. 1761, no. 1, pp. 121-8, January 2006 and D. Mims and D. Hercules, “Quantification of bile acids directly from plasma by maldi-tof-ms,” Anal Bioanal Chem, vol. 378, no. 5, pp. 1322-6, March 2004].

Most quantitative applications of mass spectrometry, however, are of the biomarker profiling type. These do not ascribe a meaning to absolute peak heights, but rather look for relative peak height changes that can be tied to a particular disease state or condition through classification [M. W. Duncan, H. Roder, and S. W. Hunsucker, “Quantitative matrix-assisted laser desorption/ionization mass spectrometry,” Brief Funct Genomic Proteomic, vol. 7, no. 5, pp. 355-70, September 2008; N. G. Ahn, J. B. Shabb, W. M. Old, and K. A. Resing, “Achieving in-depth proteomics profiling by mass spectrometry,” ACS Chem Biol, vol. 2, no. 1, pp. 39-52, January 2007; P. C. Carvalho, J. Hewel, V. C. Barbosa, and J. R. Yates, 3rd, “Identifying differences in protein expression levels by spectral counting and feature selection,” Genet Mol Res, vol. 7, no. 2, pp. 342-56, 2008]. One particular area in which peak heights need to be directly compared from one spectrum to the next is mass spectral imaging (MSI). An ion image produced from an MSI experiment is simply a false color representation of peak height across an organic tissue section. The need for comparable peak intensities from one pixel to another is therefore readily apparent.

The Nature of Ion Intensity Noise:

A mass spectrometer establishes a physical relationship between a particular molecular species and the peak height recorded at a certain mass-over-charge value. In general, quantity is one of the most important factors in this relationship, meaning that a larger number of molecules usually results in a larger ion count at the corresponding mass-over-charge bin. However, the link is rarely if ever as simple and as linear as that. In fact, peak height can be perturbed by wet lab factors such as differences in sample preparation and sample content [J. Franck, K. Arafah, A. Barnes, M. Wisztorski, M. Salzet, and I. Fournier, “Improving tissue preparation for matrix-assisted laser desorption ionization mass spectrometry imaging, part 1: using microspotting,” Anal Chem, vol. 81, no. 19, pp. 8193-202, October 2009]. It can also be influenced by instrument factors such as ionization efficiency [F. Hillenkamp, M. Karas, R. C. Beavis, and B. T. Chait, “Matrix-assisted laser desorption/ionization mass spectrometry of biopolymers,” Anal Chem, vol. 63, no. 24, pp. 1193A-1203A, December 1991] and ion detector saturation. In the case of mass spectral imaging, additional factors are introduced, such as the topology and texture of the tissue and the variation of matrix coating across the section. The best strategy is to minimize these noise factors on the wet lab side by taking care to keep all experimental parameters constant from one measurement to the next. A good example of such efforts is the matrix deposition in MSI experiments, often performed by robotic spotters in an effort to put down as homogeneous a matrix coating as possible [J. Franck, K. Arafah, A. Barnes, M. Wisztorski, M. Salzet, and I. Fournier, “Improving tissue preparation for matrix-assisted laser desorption ionization mass spectrometry imaging, part 1: using microspotting,” Anal Chem, vol. 81, no. 19, pp. 8193-202, October 2009].

In practice however, some of the relevant parameters cannot be controlled to the extent necessary to do away with ion intensity noise. Compensation for this unavoidable noise type will therefore fall to computational methods on the in silico side. Some of the ion intensity noise factors are molecule-specific, and their influence is therefore local to a particular m/z area or bin (e.g. an ion overshadowed in the detector by a more abundant co-ionizing ion of nearby mass, or ions that, for conformational reasons, are not very inclined to ionize). Other noise factors have a global effect across the entire mass range (e.g. variation in the concentration of matrix crystals). The molecule-specific factors usually pose few problems for inter-spectrum comparisons as long as the same ion species is being compared across all spectra. More precisely, the goal of a differential analysis between spectra is not to compare abundance from one ion species to another species located elsewhere on the m/z scale, but rather to compare abundance of the same ion species from one sample to the next. This means that as long as the molecule-specific factors are kept the same across the different mass spectra, by keeping the experimental parameters as constant as possible, these effects usually need not be explicitly removed. The effect of the global noise factors however is usually much more extensive and can rarely be left unadjusted. This is why most normalization procedures in mass spectrometry target mass range-wide intensity noise. An explicit enumeration of the various physical noise effects in the ion source, the mass analyzer, and the ion detector would be a difficult endeavour, as many of these processes are not yet fully understood, while others can be described with only the most elaborate of mathematical models [R. Knochenmuss and R. Zenobi, “Maldi ionization: the role of in-plume processes,” Chem Rev, vol. 103, no. 2, pp. 441-52, February 2003].

Instead of attempting to model each effect explicitly, normalization methods in mass spectrometry take an empirical stance, modeling global ion intensity noise simply as a straightforward linear scaling factor across the entire mass spectral profile. Although there is something to be said for more elaborate nonlinear noise models along the m/z axis, there are often insufficient clues as to the real ion intensities to fit a model more complex than a global scaling. The assumption of a global scaling due to noise, and counteracting it with a reverse scaling factor, has empirically been shown to give good results. One of global scaling's strong points is that in most cases it captures the bulk of the ion intensity noise, while at the same time avoiding overfitting. The problem of overfitting a noise model in mass spectrometry is not trivial. Usually there is very little information available external to the measurement (unless well-characterized calibrants are spiked into the sample, which is often impractical). Additionally, most general mass spectrometry studies have an insufficient number of replicate measurements to reliably generalize from. Some more advanced models have been formulated [Y. V. Karpievitch, T. Tavener, J. N. Adkins, S. J. Callister, G. A. Anderson, R. D. Smith, and A. R. Dabney, “Normalization of peak intensities in bottom-up ms based proteomics using singular value decomposition,” Bioinformatics, vol. 25, no. 19, pp. 2573-80, October 2009 and M. K. Kerr, M. Martin, and G. A. Churchill, “Analysis of variance for gene expression microarray data,” J Comput Biol, vol. 7, no. 6, pp. 819-37, 2000], but in general a global scaling factor remains the standard model for ion intensity noise.

Given the MALDI-based nature of the imaging experiments described in this document, it is worth mentioning that any mass spectrometry experiment using this type of ionization is particularly prone to ion intensity noise, making a normalization preprocessing step practically a prerequisite. The reason for the intensity noise lies in the use of matrix molecules to enable ionization. In MALDI-based measurements, analytes need to be embedded into matrix crystals to keep them intact during the laser-induced desorption and ionization phase. It is clear that in such a setup the number of analyte ions that are formed in the ion source will depend not only on the amount of analyte present, but also on the amount of matrix crystals present. However, growing crystals on an analyte in a sample well and later repeating that process on another sample in another well, while trying to obtain the same concentration of crystals, is not an easy task. In the MALDI mass spectrometry field substantial research has gone into improving the matrix molecules [F. Hillenkamp, M. Karas, R. C. Beavis, and B. T. Chait, “Matrix-assisted laser desorption/ionization mass spectrometry of biopolymers,” Anal Chem, vol. 63, no. 24, pp. 1193A-1203A, December 1991; C. Meriaux, J. Franck, M. Wisztorski, M. Salzet, and I. Fournier, “Liquid ionic matrixes for maldi mass spectrometry imaging of lipids,” J Proteomics, February 2010 and M. Mank, B. Stahl, and G. Boehm, “2,5-dihydroxybenzoic acid butylamine and other ionic liquid matrixes for enhanced maldi-ms analysis of biomolecules,” Anal Chem, vol. 76, no. 10, pp. 2938-50, May 2004], achieving more reproducible matrix deposition, and understanding the topic of matrix hotspots [J. Franck, K. Arafah, A. Barnes, M. Wisztorski, M. Salzet, and I. Fournier, “Improving tissue preparation for matrix-assisted laser desorption ionization mass spectrometry imaging, part 1: using microspotting,” Anal Chem, vol. 81, no. 19, pp. 8193-202, October 2009].

Hotspots are certain areas in the sample showing better crystallization and therefore higher ion intensity and signal-to-noise ratio. Although significant progress has been made, the reproducibility of matrix conditions remains a point of attention in MALDI-based research, be it standard or imaging-oriented. MSI adds to the matrix difficulties with additional effects from the tissue layer, from which the analyte needs to be desorbed and on which the matrix needs to crystallize. This means that in MSI experiments it is not uncommon to see ion intensity influenced not only by matrix conditions, but also by the particular cell type from which the measurement is taken. Both the medium and the quality of the matrix embedding cause amplification or attenuation of ion formation in the source.

State of the Art Approach, Total Ion Current:

The common normalization methods in mass spectrometry operate on the global scaling assumption mentioned above. These algorithms consider ion intensity noise to be an undesirable scaling factor, which is different for every mass spectrum. The remedy seems evident: rescale the noisy spectrum with a scaling factor multiplicatively inverse to the noise scaling factor. The problem thus presents itself as a two-step procedure:

1. Find an estimate of the noise scaling factor.

2. Scale back the mass spectrum with the inverse scale factor.

The problem is that the noise scaling factor is unknown, and the only information available is the noisy spectrum. Unless external information is provided regarding the true ion intensities of the spectrum (e.g. an externally calibrated peak intensity), the algorithm has few clues on which to base its estimate in step 1.

The problem definition above describes the situation where the goal is to remove the ion intensity noise from the mass spectrum altogether. However, most experimental setups that include a cross-comparison between mass spectra only seek relative peak height changes. As mentioned in the introduction, normalization aims to project peak heights from several spectra onto a common intensity scale to allow relative comparison.

Whether or not that common intensity scale is the true intensity scale in the absence of noise is irrelevant to most studies. The only requirement for relative comparisons of peak height is to establish a common ground between the spectra to scale towards. This common ground can be any measure that connects the ion intensities of the different spectra together. Various schemes have been suggested, particularly in the context of mass spectral search algorithms. One naive example is base peak normalization [G. Rasmussen and T. Isenhour, “The evaluation of mass spectral search algorithms,” J. Chem. Inf. Comput. Sci, vol. 19, no. 3, pp. 179-186, 1979], where spectra are scaled relative to each other such that their highest peak is equally high in all spectra. Another example is a method based on total ion current, which has been shown to produce much better results [G. Rasmussen and T. Isenhour, “The evaluation of mass spectral search algorithms,” J. Chem. Inf. Comput. Sci, vol. 19, no. 3, pp. 179-186, 1979; Z. B. Alfassi, “On the normalization of a mass spectrum for comparison of two spectra,” J Am Soc Mass Spectrom, vol. 15, no. 3, pp. 385-7, March 2004]. The total ion current (TIC) of a mass spectrum is the sum of the separate ion currents carried by the different ions contributing to the spectrum [A. D. McNaught and A. Wilkinson, Compendium of chemical terminology: IUPAC recommendations, 2nd ed. Oxford: Blackwell Science, 1997]. In mathematical terms, the TIC can be considered the sum of all ion counts collected in a mass spectrum, or the integral over the mass spectral profile. Scaling mass spectra such that they have the same TIC has become a de facto standard for normalization in many areas of mass spectrometry. For instance, some use ProTS Data software (Efeckta Technologies, Inc.) for baseline subtraction and scaling of the spectra by a total ion current (TIC) based normalization (Riehen A. A. et al., US2007/00691222). The rationale behind scaling towards a common TIC value makes physical sense: distinct experiments, executed with identical experimental parameters (e.g. laser intensity, sample amounts, ...), should arguably yield similar amounts of ions.

Most TIC-based normalization algorithms aimed at enabling relative comparisons consist of these two steps:

1. For each mass spectrum, calculate its TIC.

2. Scale back the mass spectrum with the inverse of its TIC.

Step 2 will scale all spectra to a TIC of one. Some algorithms however will scale towards a common TIC value (e.g. the median TIC of all spectra) instead. One reason for this is interpretation, in the sense that the normalized peak heights will be on a scale not too far removed from the original ion count values that were collected. Another reason is numerical precision and memory requirements. Because of memory and computation considerations, some implementations are better served with an integer-based ion count than with real values between zero and one. Theoretically all these TIC-based approaches are equivalent, as they all retain the relative differences regardless of the absolute intensity scale of the normalization result.
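The two steps above, together with the choice between a unit TIC and a common TIC value, can be sketched as follows. This is a minimal illustration that assumes the spectra are stored as rows of a NumPy array on a shared m/z axis; the function name `tic_normalize` and its `target` parameter are hypothetical and do not correspond to any particular software package.

```python
import numpy as np

def tic_normalize(spectra, target="unit"):
    """Scale each spectrum (one row per spectrum) by the inverse of its TIC.

    target="unit"   -> every spectrum ends up with a TIC of one
    target="median" -> every spectrum ends up with the median TIC of the set
    (both names are illustrative assumptions)
    """
    spectra = np.asarray(spectra, dtype=float)
    tic = spectra.sum(axis=1)                 # step 1: TIC per spectrum
    if np.any(tic <= 0):
        raise ValueError("every spectrum needs a positive TIC")
    scaled = spectra / tic[:, None]           # step 2: inverse scaling
    if target == "median":
        scaled = scaled * np.median(tic)      # move to a count-like common scale
    elif target != "unit":
        raise ValueError("target must be 'unit' or 'median'")
    return scaled

# Three spectra with the same relative profile but different noise scaling.
spectra = np.array([[10.0, 30.0, 60.0],
                    [2.0, 6.0, 12.0],         # attenuated copy
                    [20.0, 60.0, 120.0]])     # amplified copy
unit = tic_normalize(spectra)
med = tic_normalize(spectra, target="median")
```

Both variants leave the relative differences between spectra untouched; only the absolute intensity scale of the result changes.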

FIG. 1.1 gives an example of how a TIC-based normalization works for the comparison of two real mass spectra. The spectra come from a tissue profiling experiment, and the normalization algorithm that is applied is a standard TIC-based implementation called msnorm, provided with the Bioinformatics Toolbox of MATLAB (The MathWorks, Natick, Mass., USA). The algorithm took 100 percent of the spectra's TIC into account to calculate the resulting scaling factor. Although both spectra only contain positive values, one spectrum is shown as negative to enable easy comparison. A schematic overview of the two-step procedure is also shown in FIG. 1.3.

Problems with the State of the Art Total Ion Current Approach:

Ionization efficiency is defined as the ratio of the number of ions formed to the number of electrons or photons used in the ionization process [A. D. McNaught and A. Wilkinson, Compendium of chemical terminology: IUPAC recommendations, 2nd ed. Oxford: Blackwell Science, 1997]. If the laser intensity of a MALDI mass spectrometer is kept constant from one measurement to the next, ionization efficiency equates to the yield of ions formed in a mass spectral measurement. Scaling spectra to have the same ionization efficiency could therefore be considered scaling towards the same ion yield across all mass spectra. When all other parameters are kept the same, such an operation would indeed counteract the amplifying and attenuating effects presented in section 1.2. The key point however is that the statement “when all other parameters are kept the same” includes the sample content. At first glance it seems that if ionization efficiency at constant ionization energy is equal to ion yield, and if the total ion current is proportional to the ion yield hitting the detector, TIC can be a good measure for ionization efficiency. This reasoning forms the rationale behind most TIC-based normalization methods, and as a result these methods will try to equalize the total ion current across all spectra. The problem with TIC-based normalization methods is that this reasoning does not take into account differences in sample content, and their scaling is therefore done on the basis of a measure that is only partially proportional to ionization efficiency. The reason that sample content plays a role is that differences in molecular content will produce different peak patterns in the spectra. The ion counts tied into the peaks that are differential between the spectra are added to the TIC, but in reality these ion counts are not the result of a change in ionization efficiency, which is what the TIC is expected to report.

The potential harm of this wrong assumption will depend on the particulars of the measurements (e.g. how different are the peak patterns from sample to sample, how much of the TIC is differential, ...). Although the repercussions might be negligible in some cases and the use of TIC normalization is certainly better than no normalization at all, the influence of differential peaks on TIC-based normalization is almost always present. Most studies compare samples that differ in molecular content. Often finding the differences is the reason for performing the study in the first place, and the alternative would invalidate any effort to look for biomarkers. In those cases the particulars of the measurements will decide whether TIC-based normalization will underperform. As an example, let us consider two samples. Both contain the same amount of analyte A. Only the second sample additionally contains analyte B in an amount similar to A. If both samples are coated with matrix and measured using the same laser power and under the same experimental conditions, one will normally get one peak of analyte A in the spectrum of sample one (considering molecular ions for this example and no fragmentation ions) and two peaks (A and B) in the spectrum of sample two. Even if the A peak is somewhat diminished in height due to the presence of B, it is clear that a TIC-based inverse scaling of sample two would count both A and B ion counts and could severely bias the height of peak A in sample two downward.

By including the ion counts from differential peak B, the TIC of sample two is estimated two times higher than the TIC of sample one. The result is that sample two is scaled down approximately twice too strongly. After a TIC-based normalization, peak A in sample two would be only half as high as peak A in sample one, although they should represent the same quantity of analyte A. A graphic example of how TIC can steer normalization wrong is also shown in the schematic of FIG. 1.3. Some TIC-based methods can be made to compensate for the bias somewhat, by removing ion counts from peaks whose height falls within a certain user-defined quantile range or by using derived measures that emphasize the weight of larger peaks in the final scaling factor (e.g. root-mean-square). However, the same problem remains, as there are no real rules of thumb for setting the value of the quantile parameter and the same effect would still occur, only focused on smaller peaks.

To summarize, ionization efficiency is a good basis for normalization between spectra. However, we posit that TIC is not a good measure for ionization efficiency. The reason is that ionization efficiency should be compared and rectified on the basis of the parts that are common between two spectra, not on the basis of the parts that are differential, and the TIC cannot tell these two apart. In short, the more similar the content between samples, the better the TIC-based scaling. The more dissimilar the samples, the greater the bias. In studies that have a substantial amount of differential peaks from one sample to another, the bias of TIC-based methods can become troublesome. Imaging mass spectrometry is particularly vulnerable, as these data sets typically contain spectra from a wide range of different cell types and anatomical regions.
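The two-sample example with analytes A and B can be worked through numerically. The sketch below uses made-up ion counts (100 counts per peak, an assumption chosen purely for illustration) and idealizes each peak as a single bin. It contrasts scaling by the full TIC with scaling by the part of the spectrum that both samples share.

```python
import numpy as np

# Hypothetical two-bin spectra: column 0 holds peak A, column 1 holds peak B.
sample_one = np.array([100.0, 0.0])     # analyte A only
sample_two = np.array([100.0, 100.0])   # same amount of A, plus analyte B

# TIC-based scaling: the differential peak B doubles the TIC of sample two.
tic_one, tic_two = sample_one.sum(), sample_two.sum()
a_one_tic = sample_one[0] / tic_one     # normalized height of A in sample one
a_two_tic = sample_two[0] / tic_two     # normalized height of A in sample two

# Scaling on the common part only (here: peak A) keeps the A peaks equal.
common_one, common_two = sample_one[0], sample_two[0]
a_one_common = sample_one[0] / common_one
a_two_common = sample_two[0] / common_two

print(a_two_tic / a_one_tic)            # 0.5: peak A is halved in sample two
print(a_two_common / a_one_common)      # 1.0: peak A is comparable again
```

In this idealized case the bias factor equals the ratio of the two TICs, so it grows with the share of the spectrum that is differential.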

Novel Approach of the Present Invention, Ionization Efficiency Correction:

Given the problem with TIC-based methods highlighted in the previous section, we formulate a new normalization approach, named ionization efficiency correction (IEC).

The ionization efficiency provides clues towards projecting peak height intensities from different spectra onto a common scale. The difference with TIC-based methods lies in the fact that only ion counts from analytes common to all spectra are used to calculate the normalization. The participation of differential peaks in the scaling factor calculation is minimized. As we will demonstrate, TIC-based methods cannot tell the difference between common and differential content, while IEC can.

For reasons of clarity, let us define two additional concepts. The common ion current (CIC) of a mass spectrum is the sum of all ion counts belonging to the part of the mass spectrum that is common in relative profile to other mass spectra in a data set.

The differential ion current (DIC) of a mass spectrum is the sum of all ion counts belonging to the part of the mass spectrum that is not common in relative profile to other mass spectra in a data set. The TIC of a spectrum is the sum of its CIC and its DIC.

Ionization efficiency correction is a three-step normalization process:

1. Separate the part common to all spectra from the parts that are differential.

2. For each mass spectrum, calculate its CIC.

3. For each mass spectrum, scale back the mass spectrum with the inverse of its CIC.

Steps two and three are similar to the operations applied in a TIC-based algorithm. The difference is that the traditional TIC is replaced by the CIC value, which is a better estimate of ionization efficiency. The crux of the method lies in obtaining the estimate for the CIC, which is the responsibility of step one. Given a data set of multiple mass spectra, the task of step one is to identify which parts of the relative profiles of all these spectra are commonly found across the entire data set. If such a common relative expression profile can be found for all spectra, with an individual scaling factor for each mass spectrum, the CIC of a spectrum is simply the area-under-the-curve of the scaled common profile.
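Steps two and three can be sketched in a few lines once step one has delivered its result. The example below assumes that a common profile `p` and per-spectrum scaling factors `s` are already available (here they are simply written down by hand for a toy data set); the function name `iec_scale` is a hypothetical label, not part of the described implementation.

```python
import numpy as np

def iec_scale(D, s, p):
    """Steps two and three of IEC, given the step-one outputs s and p.

    D : (N, M) array, one mass spectrum per row
    s : (N,) scaling factor of the common profile in each spectrum
    p : (M,) common mass spectral profile
    """
    D = np.asarray(D, dtype=float)
    s = np.asarray(s, dtype=float)
    p = np.asarray(p, dtype=float)
    cic = s * p.sum()              # step 2: area under the scaled common profile
    return D / cic[:, None], cic   # step 3: scale the ENTIRE spectrum by 1/CIC

# Toy data: a common profile in bins 0-2, a differential peak in bin 3.
p = np.array([1.0, 2.0, 1.0, 0.0])
s = np.array([1.0, 3.0])
D = np.array([[1.0, 2.0, 1.0, 0.0],    # common profile only
              [3.0, 6.0, 3.0, 6.0]])   # 3x common profile plus differential peak
normalized, cic = iec_scale(D, s, p)
tic = D.sum(axis=1)                    # for comparison: 4 and 18
```

After scaling, the common peaks of both spectra coincide while the differential peak of the second spectrum is preserved, whereas dividing by `tic` would push the common peaks of the second spectrum down.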

The task of extracting from a collection of mass spectra a single pseudo-mass spectrum that only contains common ion peaks and relative peak heights can be approached in a number of different ways. Considered from a linear algebra perspective, the problem of step one can be translated to the mathematical domain in terms of the search for a rank-1 approximation of the two-mode array or matrix containing all the mass spectra. (Notice that the mathematical definition of a matrix applies here. This concept has no relation to the chemical matrix in which analytes are embedded for MALDI measurements.)

A rank-1 approximation is a concept often used in the context of matrix decomposition methods. The goal of a rank-1 approximation is to approximate a matrix with the product of two vectors, as depicted in FIG. 1.2. In addition to looking for a rank-1 approximation, the search should be optimized towards avoiding the inclusion of differential peaks. Given a data set of N mass spectra that each contain M m/z bins, the task of step one will be to look for a rank-1 approximation of the N × M data matrix, while penalizing differential peaks in the profile vector. This approximation entails a 1 × M vector containing the common mass spectral profile and an N × 1 vector containing scaling factors that will be used to calculate the CICs. Written as an optimization problem, this becomes:

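$$
\min_{s,\,p}\ \lVert D - s\,p^{T} \rVert^{2} \quad \text{subject to } p \text{ containing no differential peaks} \tag{1.1}
$$

where $D \in \mathbb{R}^{N \times M}$ (mass spectral data set), $s \in \mathbb{R}^{N}$ (scaling factors), and $p \in \mathbb{R}^{M}$ (mass spectral profile).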

Within the matrix decomposition field there are several different methods that can perform a rank-1 approximation, but most are tweaked towards optimizing different characteristics of the decomposition. Examples include principal component analysis (PCA) [I. T. Jolliffe, Principal component analysis, 2nd ed. New York: Springer, 2002. [Online]. Available: www.loc.gov/catdir/enhancements/fy0817/2002019560-t.html and M. Ringner, “What is principal component analysis?” Nat Biotechnol, vol. 26, no. 3, pp. 303-4, March 2008], independent component analysis (ICA) [J. V. Stone, Independent component analysis: a tutorial introduction. Cambridge, Mass.: MIT Press, 2004], and singular value decomposition (SVD) [B. DeMoor and P. Van Dooren, “Generalizations of the qr and the singular value decomposition,” SIAM Journal on Matrix Analysis and Applications, vol. 13, no. 4, pp. 993-1014, October 1992 and G. Golub and W. Kahan, “Calculating the singular values and pseudo-inverse of a matrix,” Journal of the Society for Industrial and Applied Mathematics, Series B, vol. 2, no. 2, pp. 205-224, 1965]. Because of the need to minimize differential peaks, none of these algorithms provides out-of-the-box the rank-1 approximation we need.
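To illustrate why an unconstrained rank-1 approximation is not sufficient, the sketch below applies a plain truncated SVD to a toy data matrix (an assumption made up for this illustration) in which only the second spectrum carries a differential peak in the last bin. The least-squares optimal rank-1 fit still assigns substantial weight to that bin in the profile vector.

```python
import numpy as np

# Toy data: common profile (1, 2, 1) in bins 0-2; bin 3 is differential
# and only present in the second spectrum.
D = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 6.0, 3.0, 6.0]])

# Best rank-1 approximation in the least-squares sense via the SVD.
U, S, Vt = np.linalg.svd(D, full_matrices=False)
s = U[:, 0] * S[0]          # scaling factor per spectrum
p = Vt[0]                   # unit-norm profile vector
if p.sum() < 0:             # singular vectors are defined up to a sign flip
    s, p = -s, -p

rank1 = np.outer(s, p)
residual_norm = np.linalg.norm(D - rank1)   # equals the second singular value

print(np.round(p, 3))       # the last entry (differential bin) is far from zero
```

The differential peak leaks into `p` because nothing in the plain least-squares objective penalizes it, which is the gap the constraint in the optimization problem above is meant to close.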

Empirically however, we have attained good results with a modification of the non-negative matrix factorization (NMF) algorithm [D. Lee and H. Seung, “Learning the parts of objects by non-negative matrix factorization,” Nature, vol. 401, no. 6755, pp. 788-791, 1999]. In our implementation the basic NMF algorithm is run several times on the data set in rank-1 mode, but in each iteration the differential residuals are deducted from the data set. This approach converges towards a rank-1 approximation with little or no remnants of non-common features in the profile. We use this decomposition algorithm as the driving force behind step one, but it is clear that this phase of the algorithm is an inviting area for further advanced developments in the future. Conceptually, the IEC method can be considered a normalization framework into which a particular decomposition engine can be dropped to estimate the CIC.
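The text does not spell out the details of the modified NMF, so the following is only one possible reading, offered as a hedged sketch rather than the actual implementation. It assumes that "differential residuals" are the positive parts of the data minus the current rank-1 model, that they are deducted by clipping the working data to the model, and that the final scaling factors are obtained by projecting the original spectra onto the cleaned profile. The function name `common_profile`, the iteration counts, and the toy data are all hypothetical.

```python
import numpy as np

def common_profile(D, n_outer=50, n_inner=10):
    """Rank-1 NMF with residual deduction (illustrative reading of step one)."""
    D = np.asarray(D, dtype=float)
    work = D.copy()
    p = work.mean(axis=0)                 # non-negative starting profile
    for _ in range(n_outer):
        for _ in range(n_inner):
            # For rank 1, the Lee-Seung multiplicative updates reduce to
            # these closed-form alternating updates.
            s = work @ p / (p @ p)
            p = work.T @ s / (s @ s)
        model = np.outer(s, p)
        # Assumed deduction rule: remove counts exceeding the common model.
        work = np.minimum(work, model)
    # Assumption: re-estimate the scaling factors on the ORIGINAL spectra.
    s = D @ p / (p @ p)
    return s, p

# Toy data: common profile in bins 0-2, class-specific peaks in bins 3 and 4.
base = np.array([5.0, 10.0, 3.0, 0.0, 0.0])
extra = np.array([[0.0, 0.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 4.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0, 6.0]])
labels = np.array([0, 0, 1, 1, 2, 2])
noise = np.array([1.0, 0.6, 1.4, 0.9, 1.2, 0.7])   # known noise scaling
D = noise[:, None] * (base + extra[labels])

s, p = common_profile(D)
cic = s * p.sum()
normalized = D / cic[:, None]
```

On this toy set the differential bins fade out of `p` over the outer iterations and the estimated CIC values become proportional to the known noise scaling. Real spectra, where differential and common peaks overlap, may behave less cleanly, and other deduction rules are conceivable.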

Once a pseudo-spectrum of the common mass peaks has been established, accompanied by a scaling factor for each individual mass spectrum, we have the material necessary to tell the common ion counts and the differential ion counts apart. Step two then estimates the CIC of a mass spectrum as the area-under-the-curve of the common profile (determined in step one), scaled by that mass spectrum's individual scaling factor (also determined in step one). Step three scales back the entire mass spectrum, not just the common parts, with the inverse of the CIC. A schematic overview of the TIC-based methodology and the difference with the newer IEC approach is shown in FIG. 1.3. In analogy to TIC-based methods, IEC could be labeled a CIC-based method.

One concern regarding IEC might be the following. In a situation where every photon fired at the sample is used up in the ionization process to yield an analyte ion, peak heights would drop when additional differential sample content shows up. In such a situation, TIC would indeed be equal to ionization efficiency (at constant ionization energy), and IEC might be misled. However, such a situation could only exist if the transfer from ionization energy to formed ions is 100 percent, which is extremely unlikely in real-world cases. Any real-life situation that falls short of this utopian efficiency can benefit from the IEC approach. The following sections will demonstrate the value of IEC in two distinct case studies.

EXAMPLES

Example 1: A Case Study: Synthetic Mass Spectrometry Data Set

The objective of the first case study is to give a concrete demonstration of the problems inherent to TIC-based normalization methods, and to quantify the improvements introduced by IEC normalization. A thorough comparison of normalization methods requires the availability of a gold standard against which the methods' performance can be measured. As real-world biological case studies can rarely provide sufficient characterization of the ion intensity noise on the measurements, this first case study will center on a synthetic mass spectral data set.

Creation of the Synthetic Data Set: The data set will mimic a typical experimental setup aimed at biomarker discovery. The data describes 25 individual mass spectral measurements that are engineered to have both common and differential ion peaks. To ensure the authenticity of the study, the spectra are generated from a base pattern, which is a real mass spectrum from a profiling study on mouse brain. The peaks from the base pattern will be present in all 25 spectra at various peak heights, and will fulfill the role of common pattern across the data set. Adding differential peaks to the base pattern generates four additional classes of content. The added peaks are mimicked by adding Gaussian distributions of various height and variance to the base pattern. To test the robustness of the algorithms, the shape of the additional peaks is varied. Pattern one adds only a few slim peaks. Pattern two contains different shapes through a fusion of peaks. Pattern three adds primarily wide peaks of low height, while pattern four contains a mixture. All five patterns span an m/z range from 2800 to 25000, as depicted in FIG. 1.4. By adding peaks to the base pattern, all patterns describe an area-under-the-curve or TIC that consists of both a common part and a pattern-specific differential part. The common part is equal to the base pattern and will give rise to the common ion current or CIC of the spectra generated from these patterns. The remainder of the TIC will give rise to the differential ion current or DIC of the derived spectra. FIG. 1.5 gives per pattern an overview of the proportion CIC versus DIC. The patterns represent five different content classes from which five ‘samples’ per class are added to the data set. Each pattern gives rise to five separate mass spectra, individually scaled with a noise scaling factor to mimic ion intensity noise. The noise scaling factors are randomly generated, and include both amplified as well as attenuated cases. By performing this scaling, ion intensity noise is added to the data set, and the ion intensity scales of the different mass spectra are dispersed. The goal of the normalization algorithms will now be to reverse the situation to a level that allows for solid relative comparison. Unlike in real-world case studies, here the noise scaling factors are known to us and can serve as a gold standard for normalization performance. FIG. 1.6 shows the noisy spectra that were generated and their respective scaling factors.
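The construction described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch: the base pattern here is a made-up stand-in for the real mouse brain spectrum, and the bin count, peak positions, heights, widths, and the log-uniform range of the noise scaling factors are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mz = np.linspace(2800, 25000, 2000)  # m/z axis; the bin count is an assumption

def gaussian_peak(center, height, width):
    """Gaussian-shaped peak evaluated on the m/z axis."""
    return height * np.exp(-0.5 * ((mz - center) / width) ** 2)

def build_pattern(peaks):
    """Sum a list of (center, height, width) peaks into one spectrum."""
    pattern = np.zeros_like(mz)
    for center, height, width in peaks:
        pattern += gaussian_peak(center, height, width)
    return pattern

# Stand-in base pattern; the source uses a real mouse brain spectrum instead.
base = build_pattern([(4000, 900.0, 40.0), (6500, 600.0, 40.0),
                      (9000, 400.0, 40.0), (12000, 300.0, 40.0)])

# Hypothetical differential peaks per content class (first class: none).
differential = [
    [],
    [(15000, 500.0, 30.0), (18000, 400.0, 30.0)],    # a few slim peaks
    [(16000, 300.0, 150.0), (16400, 350.0, 150.0)],  # fused peaks
    [(20000, 120.0, 600.0)],                         # wide, low peak
    [(14000, 450.0, 30.0), (21000, 150.0, 500.0)],   # mixture
]
patterns = np.array([base + build_pattern(p) for p in differential])

# Five samples per pattern, each multiplied by a random noise scaling factor.
# Log-uniform sampling (an assumption) gives both attenuation and amplification.
clean = np.repeat(patterns, 5, axis=0)               # shape (25, 2000)
scale = np.exp(rng.uniform(np.log(0.2), np.log(5.0), size=25))
noisy = clean * scale[:, None]

# Common share (CIC / TIC) of each pattern, the quantity summarized in FIG. 1.5.
cic_share = base.sum() / patterns.sum(axis=1)
```

Because `scale` is kept, dividing `noisy` by it row by row recovers `clean`, which is what makes a gold standard available in the synthetic setting.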

Comparison of Normalization Performance: First, a gold standard for normalization is established through inversion of the noise scaling factors that were used to create the data set. Then, both TIC-based and IEC normalization are applied to the noisy data set. The results are summarized in FIGS. 1.7, 1.8, 1.9, and 1.10. FIG. 1.7 shows a heat map representation of the noisy spectra.

Notice the ion intensity noise-induced striping of the spectra. Good normalization will need to remove that striping effect maximally. FIG. 1.8 shows the result after reverse scaling with the real noise factors. It is clear that the striping is gone and the noise is removed. FIG. 1.9 shows the results after using the classical TIC-based method. It shows good performance within content classes, but its normalization across different contents is incomplete. There is still definite striping between the spectra that only contain the base pattern and the ones that contain differential peaks as well. The heat map shows that the presence of non-common peaks increases the total ion count of the spectrum, and as a result the TIC-based method overestimates the spectrum's ion yield from the sample.
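The gold standard and the TIC-based method discussed above can be illustrated on a toy example. The sketch below assumes a purely multiplicative noise model and takes the TIC to be the plain sum of a spectrum's ion counts; the function names and the two four-bin spectra are hypothetical.

```python
import numpy as np

def gold_standard(noisy, scale):
    """Undo the known multiplicative noise: divide each row by its factor."""
    return noisy / scale[:, None]

def tic_normalize(spectra):
    """Divide each spectrum (row) by its total ion current, i.e. its row sum."""
    return spectra / spectra.sum(axis=1, keepdims=True)

# Two toy spectra sharing a common part; the second adds a differential peak.
common = np.array([4.0, 2.0, 0.0, 0.0])
clean = np.array([common, common + np.array([0.0, 0.0, 6.0, 0.0])])
scale = np.array([0.5, 3.0])              # known noise scaling factors
noisy = clean * scale[:, None]

restored = gold_standard(noisy, scale)
tic_norm = tic_normalize(noisy)

# Height of the first common peak in spectrum 2 relative to spectrum 1.
ratio = tic_norm[1, 0] / tic_norm[0, 0]
```

In this toy case the common share of the second spectrum is 0.5, and after TIC normalization its common peaks sit at half the height of those in the first spectrum. Under the same model, the ratio equals CIC/(CIC + DIC), so a pattern with the 64 percent common share reported for FIG. 1.5 would show its common peaks at roughly 0.64 of their height in the base-pattern spectra.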

The ramification of this overestimation is an underestimation of the necessary noise canceling factor. As a result, peak heights are too low compared to spectra that contain fewer differential peaks. FIG. 1.10 shows the results from the IEC algorithm. It shows no striping and is visually indiscernible from the gold standard pattern. Note that in most of these methods the absolute peak heights are never restored exactly. They only make spectra comparable at a relative level. The heat map illustrates that by using a rank-1 approximation of the spectra, IEC is able to avoid bias from differential peaks. FIG. 1.11 provides a closer look at some of the normalization results, by focusing on a zoomed section of the mass spectrum of sample 6. It shows the gold standard and both reconstructions for the sixth mass spectrum (with their intensities scaled between zero and one to enable direct comparison). The IEC result traces the profile of the gold standard closely, while the TIC approach clearly underestimates the true peak heights. The excellent matchup between IEC and the gold standard keeps the gold standard, indicated in blue, largely hidden behind the red line of the IEC result. Only at the very tops of the peaks do the blue tips of the gold standard rise above the IEC result.
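To make the rank-1 idea mentioned above concrete, the following sketch estimates a common profile and per-spectrum scaling factors with alternating medians. This is explicitly not the NMF-based engine of the IEC method; it is a simpler robust stand-in chosen for brevity. It assumes that every differential peak occurs in fewer than half of the spectra and that most retained bins carry only common signal. The function name, the `floor` threshold, and the toy data are hypothetical.

```python
import numpy as np

def common_scale_factors(X, n_iter=10, floor=0.05):
    """Alternating-median rank-1 fit X ~ outer(w, h); returns (w, h)."""
    w = X.sum(axis=1)                        # initial guess: the TIC
    for _ in range(n_iter):
        # Profile: per-bin median over spectra, insensitive to minority peaks.
        h = np.median(X / w[:, None], axis=0)
        keep = h > floor * h.max()           # bins carrying common signal
        # Scale: per-spectrum median ratio to the profile over the kept bins.
        w = np.median(X[:, keep] / h[keep], axis=1)
    return w, h

# Toy data: six common bins, six differential bins, five content classes.
rng = np.random.default_rng(1)
common = np.array([5.0, 3.0, 4.0, 2.0, 6.0, 1.0, 0, 0, 0, 0, 0, 0])
extra = np.zeros((5, 12))                    # row 0 is the base pattern
extra[1, 6] = 8.0
extra[2, 7:9] = 5.0
extra[3, 9:11] = 4.0
extra[4, 11] = 10.0
clean = np.repeat(common + extra, 5, axis=0) # 25 spectra, five per class
scale = rng.uniform(0.2, 5.0, size=25)       # multiplicative noise factors
X = clean * scale[:, None]

w, h = common_scale_factors(X)
normalized = X / w[:, None]                  # scaled by a CIC-style factor
```

On this toy data the estimated factors `w` are proportional to the true noise factors `scale`, whereas the plain row sums are not, because the row sums also absorb the differential ion current.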

Comparison of Normalization Performance with Additive Noise: To demonstrate the robustness of the approaches, the experiment is repeated with additive noise. First, the same multiplicative ion intensity noise from the previous run is added. Then, Gaussian additive noise is put on top of all m/z bins with a standard deviation equal to one percent of the noisy data set's intensity range (s.d. of approx. 120 ion counts). The results are shown in FIG. 1.12 (noisy version), FIG. 1.13 (TIC normalized), and FIG. 1.14 (IEC normalized). Again the heat maps show that IEC outperforms TIC, even with a significant amount of additional noise added on top of the normalization problem. Whether the difference between IEC and TIC-based methods is significant depends on the study at hand. The answer is tied to elements such as the available instrument, the sensitivity required to prove or disprove a hypothesis, and most importantly whether or not the samples in question are heterogeneous in content. The less heterogeneous, the more overlap between the spectra and the better TIC will perform. In general however, IEC will outperform TIC-based normalization in most cases because it takes the ‘common versus differential’ issue out of the hands of the experimental design, and provides in silico means of compensating for whatever form the measurements may take. This is an important asset as more and more studies are collecting ever larger amounts of measurements, cross-comparing spectra from a very wide variety of biological origins.
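The additive-noise step described above can be sketched as follows. The function name and the toy matrix are hypothetical, and the sketch leaves negative values in place, since the text does not say whether they were clipped.

```python
import numpy as np

def add_gaussian_noise(spectra, fraction=0.01, rng=None):
    """Add zero-mean Gaussian noise to every bin.

    The standard deviation is `fraction` times the intensity range
    (max minus min) of the whole data set.
    """
    rng = np.random.default_rng() if rng is None else rng
    sd = fraction * (spectra.max() - spectra.min())
    return spectra + rng.normal(0.0, sd, size=spectra.shape), sd

# Toy data set whose intensity range is 12000 ion counts.
spectra = np.zeros((25, 100))
spectra[0, 0] = 12000.0
noisy, sd = add_gaussian_noise(spectra, rng=np.random.default_rng(3))
```

With an intensity range of 12000 ion counts, one percent gives a standard deviation of 120 ion counts, in line with the approximate value quoted in the text.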

Example 2: A Case Study: Mass Spectral Imaging

Earlier, mass spectral imaging was introduced as an area of mass spectrometry that is particularly sensitive to ion intensity noise. In the case of MALDI-based MSI, one reason for this sensitivity is the matrix crystallization required by the ionization method. Also the fact that analytes are measured in situ without first separating the molecules from the surrounding tissue often plays an important role. The influence of the multiplicative ion intensity noise becomes readily visible in MSI. The most common use of MSI technology is in fact to produce ion images that show peak height across an entire organic tissue section, making the comparability of peak height from one pixel to the next crucial. Additionally, as computational MSI analysis develops further, the influence of reliable peak heights will become even more important than is currently the case with ion images. For this reason, this second case study centers on MSI and takes a closer look at IEC performance on an MSI experiment on mouse brain. A benefit of an MSI case study is that it enables intuitive assessment of the normalization results through pictures. Unlike the synthetic case study, real-world experiments are a noisy business and do not provide a gold standard against which to grade the performance of the algorithms. However, an MSI experiment is one of the few experiment types that provides the opportunity to see in the spatial domain whether the normalization results make biological sense. For example, if an anatomical region is more homogeneously filled or outlined, there is a high probability that the algorithm succeeded in extracting more useful information from the measurements, and thus exhibits better performance. This case study therefore takes a closer look at ion images to assess TIC versus IEC performance.

The MSI experiment is performed on a sagittal section of mouse brain, using a MALDI mass spectrometer. The data set consists of 1734 mass spectra collected from the mouse brain section in a rectangular grid of 51 × 34 pixels. Each mass spectrum captures 6490 m/z bins spanning a range from m/z 2800 to 25000. As the rectangular grid has to circumscribe the entire tissue section, certain mass spectra stem from outside the tissue area. These are removed from the case study to avoid introducing non-tissue derived variation into the analysis. After retaining only the on-tissue spectra, the data set consists of 1381 mass spectra that make up a data matrix of 1381 × 6490 ion count values. The baseline is removed from the analysis to avoid it being an influential factor in the assessment of normalization performance. Similar in approach to the synthetic case study, we first collect a heat map representation of the 1381 spectra in their un-normalized form in FIG. 1.15. The figure shows only the highest peaks clearly in some of the rows. The other peaks largely disappear at the lower ion count values. Then, TIC-based normalization is applied to the data set, resulting in the spectra of FIG. 1.16. Compared to FIG. 1.15, the heat map clearly demonstrates that normalization is a worthwhile endeavor in MSI. TIC-based normalization succeeds in pulling several new peaks from the measurements. A good sign for the reliability of these peaks is that they show up consistently across different spectra, appearing as vertical lines in the heat map. Unlike in the synthetic data set where horizontal striping in the heat map was used as a clue to point out incomplete normalization, the striping effects in these heat maps have a different cause. They have little to do with normalization, but are the result of pushing measurements that are acquired from a rectangular grid into a list format. The ‘breaks’ in the vertical lines therefore usually occur at intervals equal to the width of the measurement grid (e.g. roughly every 51 or 34 spectra). Finally, the ionization efficiency correction algorithm is applied. The results, shown in FIG. 1.17, clearly show that IEC is capable of extracting even more consistent peaks from the spectra than the TIC-based method could. The lower mass range shows much richer variation, and IEC seems particularly successful in pulling low intensity peaks over the noise threshold. This property of IEC is very interesting as sensitivity is a big topic of concern in MSI. Overall, the observations from the heat maps confirm the conclusions from the synthetic case study: any form of normalization is better than none, but the new IEC method does remarkably better than the standard TIC-based approach.
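The step of pushing grid measurements into a list and discarding off-tissue pixels, as described above, can be sketched as below. The grid orientation (34 rows by 51 columns), the function name, and the mask are assumptions; the text does not state how on-tissue pixels were identified. Only 8 bins are used here to keep the example small, whereas the experiment has 6490.

```python
import numpy as np

def flatten_on_tissue(cube, on_tissue):
    """Reshape a (rows, cols, bins) cube to (pixels, bins), keeping masked pixels.

    Returns the retained spectra and, for each one, its position in the
    full row-major pixel list, so it can be mapped back to the grid later.
    """
    rows, cols, bins = cube.shape
    spectra = cube.reshape(rows * cols, bins)   # one grid row after another
    index = np.flatnonzero(on_tissue.ravel())
    return spectra[index], index

rows, cols, bins = 34, 51, 8                    # 8 bins for the demo only
cube = np.arange(rows * cols * bins, dtype=float).reshape(rows, cols, bins)
on_tissue = np.ones((rows, cols), dtype=bool)
on_tissue[0, :] = False                         # hypothetical mask: top row off-tissue
spectra, index = flatten_on_tissue(cube, on_tissue)

# Grid coordinates of the first retained spectrum.
r, c = divmod(int(index[0]), cols)
```

With row-major flattening, consecutive list entries walk along one grid row and then jump to the next, which is consistent with the periodic breaks seen in the heat maps.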

However, to truly assess the value of IEC for MSI experiments, we need to take a look at the ion images produced from these data sets. Based on the heat maps from FIGS. 1.15, 1.16, and 1.17, three ion peaks are selected for comparison in the spatial domain. These ions are m/z 4977, 12181, and 18416, and ion images stemming from the un-normalized, the TIC normalized, and the IEC normalized data set are extracted for each of them. The results are shown in grid form in FIG. 1.18. To ease biological interpretation they are shown again in FIG. 1.19, transparently overlaid on a microscopic image of the mouse brain tissue. Again, the need for ion intensity normalization in MSI is clearly demonstrated. The un-normalized ion images only make anatomical sense for the tallest of peaks. Ion m/z 12181 shows little or no structure in the raw form, while as soon as some form of normalization is applied it clearly shows increased presence in the cerebellar nucleus on the right hand side, and a marked absence from the central hippocampal area. Similar observations can be made for ion m/z 4977 and 18416. The differences between the TIC-based method and IEC are more nuanced. However, they do become abundantly clear when the anatomical background is taken into account. FIG. 1.19 highlights some of the differences with arrows. The general observation seems to be that after performing IEC, the anatomical regions are more homogeneously filled and their outlines more clearly traced. For m/z 4977, this means that its presence is more widely confirmed throughout the upper and lower hippocampal area. For m/z 12181, its absence from the hippocampus is more strongly emphasized (also versus the un-normalized ion image), and its presence in the cerebellar nucleus is more evenly spread. The same is also true for m/z 18416, which shows up in the central white ventricle area and in the elongated corpus callosum that touches it at the top.
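Extracting an ion image of the kind compared above can be sketched as follows. The sketch assumes a uniformly spaced m/z axis (the text gives only the range and the bin count), reads a single nearest bin rather than integrating a peak, and uses random numbers in place of measured spectra. The function name and the pixel positions are hypothetical.

```python
import numpy as np

def ion_image(spectra, index, mz_axis, target_mz, grid_shape):
    """Place the intensity of the bin nearest target_mz back on the pixel grid."""
    bin_idx = int(np.argmin(np.abs(mz_axis - target_mz)))
    image = np.full(grid_shape[0] * grid_shape[1], np.nan)  # off-tissue stays NaN
    image[index] = spectra[:, bin_idx]
    return image.reshape(grid_shape)

mz_axis = np.linspace(2800, 25000, 6490)       # uniform spacing is an assumption
index = np.array([0, 3, 52, 100, 1733])        # hypothetical on-tissue list positions
spectra = np.random.default_rng(2).random((index.size, mz_axis.size))

# One image per ion discussed in the text, on a 34-row by 51-column grid.
images = {mz: ion_image(spectra, index, mz_axis, mz, (34, 51))
          for mz in (4977, 12181, 18416)}
```

Running the same extraction on the un-normalized, TIC normalized, and IEC normalized matrices would give the three variants per ion that the comparison relies on.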

CONCLUSION

This document introduced a novel normalization method for use in standard mass spectrometry as well as mass spectral imaging. The ionization efficiency correction method comes closer to the goal of using ionization efficiency for normalization purposes than the current industry standard based on total ion current. The reason for its improved performance lies in its ability to discern in the mass spectra common peak patterns from differential peak patterns, and to adjust its scaling factors accordingly.

IEC does this by fusing the chemistry and physics considerations towards normalization with the mathematical concepts of matrix decomposition. It is in this unique fusion of approaches that the novelty of the method lies. IEC provides both a general framework for normalization and a concrete implementation of that framework using non-negative matrix factorization. Further development will particularly focus on improving the rank-1 approximation engine of the IEC framework and on reducing its computational requirements.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein.

For instance, although originally designed for processing of mass spectral data, the IEC method described here can be applied to any method generating a “chromatographic” style output (a collection of peaks distributed across an x/y scale, where the peak heights are proportional to the concentration/abundance/intensity of the measured event). Obvious examples are liquid chromatography (LC), gas chromatography (GC) and densitometric scans.

It is intended that the specification and examples be considered as exemplary only.

Each and every claim is incorporated into the specification as an embodiment of the present invention. Thus, the claims are part of the description and are a further description and are in addition to the preferred embodiments of the present invention.

Each of the claims sets out a particular embodiment of the invention.

DRAWING DESCRIPTION Brief Description of the Drawings

The present invention will become more fully understood from the detailed description given herein below and the accompanying drawings which are given by way of illustration only, and thus are not limitative of the present invention, and wherein:

FIG. 1.1 is a graphic that provides an example of normalization. (a) Before ion intensity normalization is applied, the peak heights differ substantially between both spectra. (b) After normalization, the spectra are put on the same intensity scale, enabling direct comparison of peak height.

FIG. 1.2 is a schematic diagram showing rank-1 approximation of a matrix. A rank-1 approximation attempts to decompose the matrix into a single product of vectors, while potentially optimizing additional constraints. In the case of IEC, the matrix consists of mass spectra and the decomposition produces a common mass spectral profile in one vector (optimized to avoid differential peaks), and a set of scaling factors in the other vector.

FIG. 1.3 is a schematic comparison of TIC-based normalization and IEC normalization. Notice the extra step in IEC, which employs a matrix decomposition method to discern common peak patterns (leading to CIC) from differential peak patterns (leading to DIC) in the mass spectra.

FIG. 1.4 is a graphic providing five content patterns (spectra). These five patterns form the basis for the 25 individual mass spectra that make up the synthetic data set. Notice the common peaks, equal to the base pattern, and the pattern-specific differential peaks added to the second half of the mass range.

FIG. 1.5 is a graphic that displays the common and differential percentages of the total ion currents. The base pattern consists completely of peaks common to all spectra. The additional patterns 1 through 4 contain respectively 64, 61, 55, and 55 percent common ion counts.

FIG. 1.6 is a graphic that provides the noisy data set of 25 spectra. The legend indicates the pattern from which the spectrum stems and with what scaling factor the spectrum was perturbed.

FIG. 1.7 provides an image of a heat map of the spectra with ion intensity noise. The attenuations and amplifications of the rows are clearly visible, and make direct comparison of peak height from one sample to another impossible. The multiplicative noise gives a false sense of variation in quantity. In reality, every peak stems from the same ‘amount’ of ions in its pattern.

FIG. 1.8 provides an image of a heat map of the spectra without noise. This is the result of reverse scaling using the known noise factors, and can serve as a gold standard for the normalization process.

FIG. 1.9 provides an image of a heat map of the TIC normalized spectra. Notice that samples 1 through 5 are correctly normalized relative to each other. These stem from the base pattern and their TIC is equal to their CIC, explaining why TIC-based normalization performs well in these cases. However, samples 6 through 25 show a clear spectrum-wide peak height difference when compared to the first five samples. This striping effect is artificial peak height variation introduced by the TIC-based algorithm, and could be mistaken for genuine biological variation.

FIG. 1.10 provides an image of a heat map of the IEC normalized spectra. Notice the absence of any striping effect, even when comparing spectra with many differential peaks to spectra with little or none. The relative scaling by IEC restores the pre-noise data set well, and the heat map is visually indiscernible from the gold standard heat map. Note that normalization does not necessarily restore the absolute peak height. It is only concerned with relative peak height changes.

FIG. 1.11 is a graphic that provides a zoomed-in look at the normalization of the mass spectrum from sample 6. The TIC-based normalization (green) clearly underestimates, while the IEC (red) is almost perfectly matched with the gold standard (blue). Only at the very tips does the gold standard become visible.

FIG. 1.12 provides an image of a heat map of the spectra with ion intensity noise in the presence of Gaussian additive noise.

FIG. 1.13 provides an image of a heat map of the TIC normalized spectra in the presence of Gaussian additive noise. Notice the bad scaling of sample 12. This is the result of low ion intensity values being swamped by the additive noise. The TIC-based method is not able to discern peaks produced by the additive noise from real common low abundance peaks. The additive noise peaks are in the same relative intensity range as the real ion peaks, and contribute almost as much to the TIC as real peaks. As a result, the amount of TIC is overestimated (a large part of it being noise), which leads to an underscaling of the spectrum.

FIG. 1.14 provides an image of a heat map of the IEC normalized spectra in the presence of Gaussian additive noise. IEC does a better job than the TIC-based method of bringing the spectrum of sample 12 up to comparable peak heights. Although noise peaks are scaled up as well, it is preferable to have at least the real and common peaks at their correct height. Noise peaks that are scaled upward can always be removed from the analysis later, by removing peaks that only appear in a single sample.

FIG. 1.15 provides an image of a heat map of the un-normalized spectra from the MSI experiment.

FIG. 1.16 provides an image of a heat map of the TIC normalized spectra from the MSI experiment. Notice the increased amount of common peaks pulled from the data.

FIG. 1.17 provides an image of a heat map of the IEC normalized spectra from the MSI experiment. IEC pulls more consistent peaks from the data than the TIC-based method could.

FIG. 1.18 provides pictures with comparison of normalization results for three separate ion images. The ion images of three different ions, m/z 4977, 12181, and 18416, are shown in three situations: un-normalized, TIC normalized, and IEC normalized. Notice that IEC succeeds in extracting more biologically relevant structure from the data set than the TIC-based method. A version of these images overlaid on a microscopic image of the tissue is available in FIG. 1.19.

FIG. 1.19 provides pictures with comparison of normalization results for three separate ion images, overlaid on a microscopy image of the tissue section to aid biological interpretation. The ion images of three different ions, m/z 4977, 12181, and 18416, are shown for three situations: un-normalized, TIC normalized, and IEC normalized. Particular areas where IEC outperforms the TIC-based method are highlighted with an arrow.

The invention claimed is:
 1. An apparatus adapted to separate and quantitatively analyze a test species, the apparatus comprising a mass spectrometer including: a detector configured for measuring a test spectrum of said test species, and converting the test spectrum to ion intensity test signals wherein the ion intensity test signals comprise a collection of peaks distributed across a x/y scale; a processor adapted to receive said ion intensity test signals from said detector and to process said ion intensity test signals to remove undesirable variation or noise due to variability of the mass spectrometer before further processing the test spectrum, wherein the processor is programmed for performing a normalization preprocessing of the ion intensity test signals received from said detector wherein the normalization preprocessing comprises: a. Providing a data set comprising a comparative spectrum for each of a plurality of comparative species related to, but different from, said test species; b. Identifying a common profile comprising individual peaks that are common to all spectra of the combined data set containing the comparative spectra and the test spectrum; c. For each spectrum, calculating its common ion current (CIC) as a sum of only those ion counts belonging to the part of the spectrum that corresponds to the common profile; d. For each spectrum, scale back the entire spectrum, including portions not found in the common profile, with the inverse of its CIC or with a CIC-derived scaling factor in order to remove ion intensity noise and generate for each spectrum a CIC normalized spectrum having comparable peak intensities.
 2. The apparatus according to claim 1, wherein the normalization preprocessing comprises e. Generating a mass spectral image comprising a plurality of spatial locations or pixels configured as a graphical representation of peak height and/or peak area for each CIC normalized spectrum, said plurality of pixels corresponding to an image of the test species and representing comparable peak intensities from one location or pixel to another.
 3. The apparatus according to claim 1, wherein the processor is programmed for a normalization preprocessing of the signals of said apparatus, wherein the normalization process is without a total ion current (TIC)-based normalization step to assure that ionization efficiencies are compared and rectified on the basis of the common profile and not on the basis of the parts that are not common between the spectrum and the common profile.
 4. The apparatus according to claim 1, wherein the normalization is an ionization efficiency correction.
 5. The apparatus according to claim 1, wherein the processor is adapted for processing the signals into information that demonstrates relative current produced by ions in relation to varying mass/charge ratios.
 6. The apparatus according to claim 1, wherein the detector comprises an electronic detection means for ion detection and further comprises a means for desorption or vaporization, an ionization means and an ion acceleration means with ion separation or deflection means to separate ions, for instance according to their mass and charge.
 7. The apparatus according to claim 1, wherein the detector comprises: 1) an ion source for ionizing a specimen to generate ions; 2) an ion sorting means, the so called mass or ion mobility analyzer, responsible for sorting and separating ions according to their mass and charge or their mobility, which comprises an ion transport portion for transporting the ions with a mass or mobility selection and/or analyzing means for computation of the m/z or mobility ratios based on the detailed motion of the ions passing through the field; and 3) an ion detector for recording either charge induced or current produced when an ion passes by or hits a surface.
 8. The apparatus according to claim 1, with a display means to display relative abundance or intensity of ion with a specific mass-to-charge ratio (m/z) in peaks on a graphic.
 9. The apparatus according to claim 1, wherein the common profile is determined by a decomposition algorithm that extracts from the data set a plurality of common ion peaks and relative peak heights and/or relative peak areas.
 10. The apparatus according to claim 1, wherein the combined data set comprises N spectra that each contain M m/z bins, and the normalization preprocessing further comprises: identifying a rank-1 approximation of the two-mode array or matrix containing all the spectra by organizing a rank-1 approximation of the N×M data matrix, while penalizing differential peaks in the profile vector, wherein the common profile is generated as a 1×M vector containing the common spectral profile and a N×1 vector containing scaling factors; and utilizing the scaling factor as an estimate for the common ion current (CIC) for each spectrum.
 11. The apparatus according to claim 10, wherein the rank-1 approximation is identified by running a non-negative matrix factorization (NMF) algorithm on the combined data set in rank-1 mode multiple times, wherein the differential residuals are deducted from the data set in each iteration.
 12. A method of diagnosing of a disorder or biological abnormality, wherein the method comprises processing of a plurality of variables obtainable from assaying of spectroscopic images or profiles of a test species obtained from a patient, wherein the method comprises normalization preprocessing of signals obtained from a mass spectrometer wherein the signals comprise a collection of peaks distributed across a x/y scale, wherein the normalization preprocessing comprises the steps of: a. Providing a data set comprising a comparative spectrum for each of a plurality of comparative species related to, but different from, said test species; b. Identifying a common profile comprising individual peaks that are common to all spectra of the combined data set containing the comparative spectra and the test spectrum; c. For each spectrum, calculating its common ion current (CIC) as a sum of only those spectroscopic counts belonging to the part of the spectrum that corresponds to the common profile; d. For each spectrum, scale back the entire spectrum, including portions not found in the common profile, with the inverse of its CIC or with a CIC-derived scaling factor.
 13. The method according to claim 12, wherein the normalization preprocessing comprises: e. Generating a mass spectral image comprising a plurality of spatial locations or pixels configured as a graphical representation of peak height and/or peak area for each CIC normalized spectrum said plurality of pixels corresponding to an image of the test species and representing comparable peak intensities from one location or pixel to another.
 14. An operating system for operating the method according to claim 12 which controls the allocation of an assay system to generate biomarker values of a patient and which feeds the input signals from the assay system into a signal processor comprising a mathematical model that is described on the relationship of a plurality of biomarker variables and a plurality of disorder variables from assaying of biological samples of a plurality of patients with no disorder, affected with disorder, affected with a defined seriousness or with defined progress of disorder.
 15. The operating system according to claim 14 for determining the presence or absence of disorder, the seriousness of disorder or the progress of disorder in the patient.
 16. The operating system according to claim 14, wherein the operating system includes a user interface to enable the user to interact with the functionality of the computer.
 17. The operating system according to claim 14, wherein the operating system includes a graphical user interface whereby the operating system controls the ability to generate graphics on the computer's display device that can be displayed in a variety of manners representative for or associated with the condition of disorder in a selected patient or a group of patients to allow a user to distinguish between the absence of disorder, the seriousness of disorder or the progress of disorder in identified patients or patient groups.
 18. A mass-spectrometer implemented method for producing a normalized mass spectrum, comprising the steps of: generating, with an ion source and a detector, a plurality of test spectra by ionizing a test species, and converting the test spectra to ion intensity test signals wherein the ion intensity test signals comprise a collection of peaks distributed across a x/y scale; comparing, with a processor on a peak-by-peak basis, each peak of each of the plurality of test spectra to produce a common profile comprising individual peaks that are common to all spectra of the plurality of test spectra; generating, with a processor, for each of the plurality of test spectra a normalized spectrum by scaling back each of the plurality of test spectra with the inverse of the sum of those ion counts belonging to the test spectrum that correspond to the common profile or with a factor derived from the inverse of the sum of those ion counts belonging to a part of the test spectrum that correspond to the common profile.
 19. The mass-spectrometer implemented method according to claim 18, wherein, for each of a resulting plurality of normalized spectra, storing a false color representation of peak height and/or peak area as a pixel corresponding to an image of the test species.